Abstract

We present Kovalenko's full-rank limit as a tight lower bound on the decoding error probability of
LDPC codes and LT codes over the BEC. From the limit, we derive a full-rank overhead as a lower bound
on the stable overheads for successful maximum-likelihood decoding of the codes.

I. Introduction and Background

Low-Density Parity-Check (LDPC) codes [4], [5] and Luby Transform (LT) codes [6], [7] over
Binary Erasure Channels (BEC) have become quite popular for a variety of applications over
packet networks such as the Internet. The popularity of LDPC and LT codes is due in part to
(a) the low complexity of the popular set of decoding algorithms that fall under the umbrella
of the Message Passing Algorithm (MPA) (otherwise called the Belief Propagation Algorithm for
the BEC) [4], [5], (b) the good error performance of the MPA for codes of large block lengths, and
(c) the flexibility in choosing the block lengths of these codes, which makes them usable for a
variety of applications.
In the BEC, without loss of generality, the task of both LT and LDPC decoders is to recover the
unique solution of a consistent linear system

$$ H X^T = \beta^T, \qquad \beta = (\beta_1, \ldots, \beta_m) \in (\mathbb{F}_2^s)^m, \tag{I.1} $$

where $H$ is an $m \times n$ matrix over $\mathbb{F}_2$. This can be explained briefly as follows. In the case of LT codes,
to communicate an information symbol vector $\alpha = (\alpha_1, \ldots, \alpha_n) \in (\mathbb{F}_2^s)^n$, a sender constantly
generates and transmits a syndrome symbol $\beta_i = H_i \alpha^T$ over the BEC, where $H_i \in \mathbb{F}_2^n$ is generated
at random on the fly by using the Robust Soliton Distribution $\mu(x) = \sum_d \mu_d x^d$ (see
[6]). A receiver then acquires a set of pairs $\{(H_{i_t}, \beta_{i_t})\}_{t=1}^{m}$ and interprets it as System (I.1).
Hence, the variable vector $X = (x_1, \ldots, x_n) \in (\mathbb{F}_2^s)^n$ in the system represents the information
symbol vector $\alpha$. In the case of LDPC codes, by contrast, a sender transmits a codeword vector
$\alpha = (\alpha_1, \ldots, \alpha_N)$ in $\mathrm{Ker}(M) = \{\alpha \in (\mathbb{F}_2^s)^N \mid M \alpha^T = 0\}$, where $M$ is an $m \times N$ binary check
matrix. Due to erasures, some symbols of $\alpha$ may be lost, and a receiver acquires only a part of
$\alpha$, denoted $\hat{\alpha}$. Then, by the rearrangements $\alpha = (\hat{\alpha}, X)$ and $M = [\hat{H}; H]$, where $\hat{H}$ and $H$
consist of the columns of $M$ associated with the symbols of $\hat{\alpha}$ and $X$, respectively, the receiver interprets
the kernel space constraint $M \alpha^T = 0$ as System (I.1), where $\beta^T = \hat{H} \hat{\alpha}^T$. Hence, in LDPC
codes, $X$ represents the lost symbol vector of $\alpha$.

In LT codes, the column-dimension $n$ of $H$ is fixed, the row-dimension $m$ of $H$ is a variable,
and the reception overhead $\gamma = \frac{m-n}{n}$ is the key parameter for measuring the error-performance of
codes. In LDPC codes, however, the row-dimension $m$ is fixed in general, the column-dimension
$n = pN$ is a variable, and the erasure rate (or loss rate) $p = \frac{n}{N}$ is the key parameter for measuring
the error-performance of codes. Let $R = 1 - \frac{m}{N}$ be the code-rate of an LDPC code. By using $m = (1+\gamma)n$,
$n = pN$, and $R = 1 - \frac{m}{N}$, $p$ and $\gamma$ are expressed as

$$ p = \frac{1-R}{1+\gamma} \quad \text{and} \quad \gamma = \frac{m-n}{n} = \frac{1-(R+p)}{p}. \tag{I.2} $$

Like LT codes, thus, the error-performance of LDPC codes can also be measured in terms of $\gamma$.
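The conversions $p = (1-R)/(1+\gamma)$ and $\gamma = (1-(R+p))/p$ can be checked numerically. The following is a minimal Python sketch; the function names `loss_rate` and `overhead` are illustrative choices and do not come from the paper.

```python
def loss_rate(R, gamma):
    """Erasure rate p = (1 - R) / (1 + gamma) for an LDPC code of rate R."""
    return (1.0 - R) / (1.0 + gamma)


def overhead(R, p):
    """Overhead gamma = (1 - (R + p)) / p seen by the decoder at erasure rate p."""
    return (1.0 - (R + p)) / p


if __name__ == "__main__":
    # Round trip for a rate-1/2 code: gamma -> p -> gamma.
    p = loss_rate(0.5, 0.25)
    print(p, overhead(0.5, p))
```

The two functions are inverses of each other for a fixed rate $R$, which is what allows LDPC performance to be plotted against either $p$ or $1+\gamma$.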

Several works have shown the existence of capacity-approaching LDPC codes [9] and optimal
LT codes [6], [7], whose minimal overheads for successful decoding by the MPA with high
probability tend to zero as the block lengths ($n$ for LT and $N$ for LDPC codes) increase to infinity.
For codes of short block lengths, however, their minimum overheads (for successful decoding
by the MPA with high probability) are not close to zero. Furthermore, even for a nontrivial $\gamma > 0$,
the full-rank probability $\Pr(\mathrm{Rank}(H) = n)$ is not very close to 1.

System (I.1) has a unique solution iff $\mathrm{Rank}(H) = n$, the full rank of $H$. In the
full-rank case, the unique solution can be recovered by using a Maximum-Likelihood Decoding
Algorithm (MLDA) such as the ones in [5], [10]-[12]. These algorithms are efficient forms of Gaussian
Elimination (GE) that fully utilize an approximate lower triangulation of $H$, which is obtainable
by using the diagonal extension process with various greedy algorithms [4], [5]. Under those
GE-based MLDAs, thus, the probability of decoding success is precisely $\Pr(\mathrm{Rank}(H) = n)$.
Let us define the Decoding Error Probability (DEP) of a code as the rank-deficient probability

$$ P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n) = 1 - \Pr(\mathrm{Rank}(H) = n), \tag{I.3} $$

where $H$ is an $m \times n$ decoder matrix of System (I.1) with $\gamma = \frac{m-n}{n}$. Assume that $P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n)$
is a decreasing function with respect to $\gamma$. Then for a given error-bound (or deficiency bound)
$0 < \delta < 1$, define

$$ \gamma^*(\delta, n) = \min_{\gamma \ge 0} \{\gamma \mid P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n) \le \delta\}, \tag{I.4} $$

and refer to it as the Minimum Stable Overhead (MSO) of a code within the error-bound $\delta$. Since
$P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n)$ is decreasing, we may expect that $P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n) \le \delta$ for any $\gamma \ge \gamma^*(\delta, n)$. Thus,
the key part of designing codes is to identify lower bounds for the DEP and MSO, and then to obtain
codes whose DEP and MSO are close to the bounds.
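Since the DEP is defined as a rank-deficiency probability, it can be estimated by sampling decoder matrices and computing their rank over $\mathbb{F}_2$. The sketch below is only an illustration under stated assumptions: it uses plain elimination on integer bitmasks rather than the MLDAs of [5], [10]-[12], and it samples dense uniform rows rather than an actual LT or LDPC ensemble. The names `gf2_rank` and `estimate_dep_dense` are hypothetical.

```python
import random


def gf2_rank(rows):
    """Rank over F_2 of a matrix whose rows are given as integer bitmasks."""
    pivots = {}  # leading-bit position -> stored row with that leading bit
    for r in rows:
        while r:
            top = r.bit_length() - 1
            if top not in pivots:
                pivots[top] = r
                break
            r ^= pivots[top]  # cancel the leading bit and keep reducing
    return len(pivots)


def estimate_dep_dense(m, n, trials, seed=0):
    """Fraction of rank-deficient m x n matrices with i.i.d. uniform entries."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        rows = [rng.getrandbits(n) for _ in range(m)]
        if gf2_rank(rows) < n:
            failures += 1
    return failures / trials


if __name__ == "__main__":
    # n = 20 unknowns, overhead gamma = 0.25 (m = 25 received symbols).
    print(estimate_dep_dense(25, 20, 2000))
```

Replacing the row sampler with one that follows a code's degree distribution would give an estimate of that code's DEP at the chosen overhead.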

As the main contribution of this paper, we define Kovalenko's Full-Rank Limit
(KFRL), denoted $K(1+\gamma, n)$, from Kovalenko's rank-distribution of binary random matrices
[1]-[3], and show that it is a probabilistic lower bound for $P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n)$, i.e.,
$K(1+\gamma, n) \le P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n)$ for any $\gamma$ and $n$. We then derive Kovalenko's Full-Rank Overhead (KFRO) from
the KFRL, denoted $\gamma_K(\delta, n)$, as a lower bound for the MSO, i.e., $\gamma_K(\delta, n) \le \gamma^*(\delta, n)$ for any $\delta$ and
$n$, and show that the overhead $\gamma_K(\delta, n)$ tells the least number of symbols that a receiver should
acquire to achieve $P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n) \le \delta$. We also provide experimental evidence for the
viability that, given a destined error-bound $\delta_0$, both LT and LDPC codes may be designed so that
their error-performances in $P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n)$ and $\gamma^*(\delta, n)$ are close to $K(1+\gamma, n)$
and $\gamma_K(\delta, n)$ for $\delta \ge \delta_0$, respectively, by supplementing a sufficient number of dense rows to $H$ of
System (I.1).
The remainder of this paper is organized as follows. In Section II, we define the KFRL and
KFRO and verify them as lower bounds for the DEP and MSO of LDPC and LT codes. In Section III,
we present experimental results on the performance of codes in terms of DEP and overhead.
We summarize the paper in Section IV.

II. KOVALENKO'S FULL-RANK LIMIT AND OVERHEADS 

Let us first clarify terms and notations for the remainder of this section. Let $|H_i|$ denote the
number of nonzero entries of a row $H_i$ of $H$, and refer to it as the degree of $H_i$. Given an overhead
$\gamma$, we shall assume that $\gamma n = k$ for some integer $k \ge 0$. Let $\bar{H}$ denote an $m \times n$ random binary
matrix over $\mathbb{F}_2$ that consists of random rows $\bar{H}_i = (\bar{h}_{i1}, \ldots, \bar{h}_{in})$ for $1 \le i \le m$, such that
$\Pr(\bar{h}_{ij} = 1) = \frac{1}{2}$ for $1 \le j \le n$. Finally, let $\xi_k(n-s) = \Pr(\mathrm{Rank}(\bar{H}) = n-s)$ be the probability
that $\mathrm{Rank}(\bar{H}) = n-s$, where $k = m-n$ (or $k = \gamma n$).

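Let us introduce Kovalenko's rank-distribution of $\bar{H}$. It is shown in [1]-[3] by Kovalenko that,
for any fixed integers $k$ and $s$ with $l = k + s \ge 0$,

$$ \xi_k(n-s) = 2^{-s(k+s)} \prod_{i=s+1}^{n} \left(1 - \frac{1}{2^i}\right) S(n-s, l), \tag{II.1} $$

where

$$ S(n-s, l) = \sum_{i_1=0}^{n-s} \sum_{i_2=i_1}^{n-s} \cdots \sum_{i_l=i_{l-1}}^{n-s} \frac{1}{2^{i_1 + i_2 + \cdots + i_l}}. \tag{II.2} $$

Since $\lim_{n \to \infty} S(n-s, l) = \prod_{i=1}^{l} \left(1 - \frac{1}{2^i}\right)^{-1}$, it holds that

$$ \lim_{n \to \infty} \xi_k(n-s) = 2^{-s(k+s)} \prod_{i=s+1}^{\infty} \left(1 - \frac{1}{2^i}\right) \prod_{i=1}^{k+s} \left(1 - \frac{1}{2^i}\right)^{-1}. \tag{II.3} $$

In fact, the limit distribution above still holds when entries of $\bar{H}$ meet the density constraint

$$ \frac{\ln(n) + x}{n} \le \Pr(\bar{h}_{ij} \ne 0) \le 1 - \frac{\ln(n) + x}{n}, \tag{II.4} $$

where $x \to \infty$ arbitrarily slowly. The limit distribution, however, is not directly applicable to $H$
in System (I.1), because entries of $H$ may not follow the constraint (II.4).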
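For the uniform case $\Pr(h_{ij} = 1) = \frac{1}{2}$, the finite-$n$ rank distribution can be evaluated exactly. The sketch below assumes the classical closed form $2^{-s(k+s)} \prod_{i=s+1}^{n} (1 - 2^{-i}) \, S(n-s, l)$ with $l = k+s$, where $S(N, l)$ is the sum of $2^{-(i_1 + \cdots + i_l)}$ over $0 \le i_1 \le \cdots \le i_l \le N$. The function names `nested_sum` and `xi` are illustrative, and exact rational arithmetic is used so that the probabilities can be checked to sum to one.

```python
from fractions import Fraction
from functools import lru_cache


def nested_sum(N, l):
    """S(N, l): sum of 2^-(i_1+...+i_l) over 0 <= i_1 <= ... <= i_l <= N."""

    @lru_cache(maxsize=None)
    def tail(lo, parts):
        # Sum over the remaining `parts` indices, each at least `lo`.
        if parts == 0:
            return Fraction(1)
        return sum(
            (Fraction(1, 2 ** i) * tail(i, parts - 1) for i in range(lo, N + 1)),
            Fraction(0),
        )

    return tail(0, l)


def xi(n, k, s):
    """Pr(Rank = n - s) for a uniform random (n + k) x n binary matrix."""
    l = k + s
    prod = Fraction(1)
    for i in range(s + 1, n + 1):
        prod *= 1 - Fraction(1, 2 ** i)
    return Fraction(1, 2 ** (s * l)) * prod * nested_sum(n - s, l)


if __name__ == "__main__":
    n, k = 6, 2
    dist = [xi(n, k, s) for s in range(n + 1)]
    print([float(p) for p in dist], sum(dist))
```

For a $2 \times 2$ matrix ($n = 2$, $k = 0$) this gives $3/8$, $9/16$ and $1/16$ for ranks 2, 1 and 0, matching a direct count of the 16 binary matrices.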

In the following, we define the KFRL and verify it as a lower bound for $1 - \xi_k(n) = \Pr(\mathrm{Rank}(\bar{H}) < n)$.
We then define the KFRO from the KFRL and verify it as a lower bound for the MSO. Foremost, notice
that the sequence $\{S(n-s, l)\}_{n=s}^{\infty}$ is in fact increasing, therefore

$$ S(n-s, l) \le \lim_{n \to \infty} S(n-s, l) = \prod_{i=1}^{k+s} \left(1 - \frac{1}{2^i}\right)^{-1}. \tag{II.5} $$

By plugging $s = 0$ into (II.1) and (II.5), we have

$$ 1 - \prod_{i=k+1}^{n} \left(1 - \frac{1}{2^i}\right) \le 1 - \xi_k(n). \tag{II.6} $$

With the left-hand side above, where $k = \gamma n$, define

$$ K(1+\gamma, n) = 1 - \underbrace{\prod_{i=k+1}^{n} \left(1 - \frac{1}{2^i}\right)}_{g(k,n)}, \tag{II.7} $$

and refer to it as the KFRL. For a given error-bound $\delta$ now, define

$$ \gamma_K(\delta, n) = \min_{\gamma \ge 0} \{\gamma \mid K(1+\gamma, n) \le \delta\}, \tag{II.8} $$

and refer to it as the KFRO with $\delta$. Notice that the KFRL is decreasing with respect to $\gamma$, and thus
$K(1+\gamma, n) \le \delta$ for any $\gamma \ge \gamma_K(\delta, n)$. Observe from (II.7) that $g(k+1, n) = \left(1 - \frac{1}{2^{k+1}}\right)^{-1} g(k, n)$.
Hence, by $g(0, n) \approx 0.2887880951$ for $n \ge 50$, $K(1+\gamma, n)$ can be computed explicitly by
(II.7), and consequently, $\gamma_K(\delta, n)$ is obtainable from the graph of $K(1+\gamma, n)$.
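Both the KFRL and the KFRO are cheap to compute directly from the product $1 - \prod_{i=k+1}^{n} (1 - 2^{-i})$ with $k = \gamma n$. The following Python sketch does so in floating point; the function names `kfrl` and `kfro` are illustrative, and the KFRO is searched only over overheads of the form $k/n$ with integer $k$, as assumed in this section.

```python
def kfrl(k, n):
    """K(1 + gamma, n) with k = gamma * n, i.e. 1 - prod_{i=k+1}^{n} (1 - 2^-i)."""
    g = 1.0
    for i in range(k + 1, n + 1):
        g *= 1.0 - 2.0 ** (-i)
    return 1.0 - g


def kfro(delta, n):
    """gamma_K(delta, n): the smallest k / n (k integer) with K(1 + k/n, n) <= delta."""
    for k in range(n + 1):
        if kfrl(k, n) <= delta:
            return k / n
    raise ValueError("no overhead k/n with 0 <= k <= n meets the bound")


if __name__ == "__main__":
    n = 100
    print(1.0 - kfrl(0, n))   # g(0, n), about 0.2888
    print(kfro(1e-4, n))      # overhead needed for delta = 1e-4
```

Because $K$ is decreasing in $k$, the linear scan in `kfro` could be replaced by a bisection, but for block lengths in the hundreds the difference is negligible.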

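The following proposition is used for upper bounds on $K(1+\gamma, n)$ and
$\gamma_K(\delta, n)$, and for the proof of Lemma II.1.

Proposition II.1. Let $V = (v_1, \ldots, v_n) \in \mathbb{F}_2^n$ be given with $|V| = k > 0$, and let
$W = (w_1, \ldots, w_n) \in \mathbb{F}_2^n$ be a random vector such that $\Pr(w_i = 1) = \frac{d}{n}$ for $1 \le i \le n$. Then

$$ \Pr(W \cdot V^T = 0) = \frac{1 + \left(1 - \frac{2d}{n}\right)^k}{2}, \tag{II.9} $$

where $W \cdot V^T = \sum_{i=1}^{n} w_i v_i$ over $\mathbb{F}_2$.

Proof: From binomial expansions, we have

$$ \sum_{s\ \mathrm{even}} \binom{k}{s} a^s b^{k-s} = \frac{(a+b)^k + (b-a)^k}{2}. \tag{II.10} $$

Let $p_i = \Pr(w_i = 1)$ for $1 \le i \le n$. Since $|V| = k$, assume without loss of generality that $v_i = 1$
for $1 \le i \le k$ and $v_i = 0$ for $k+1 \le i \le n$, so that $\Pr(W \cdot V^T = 0) = \Pr\left(\sum_{i=1}^{k} w_i = 0\right)$.
Then, since $\sum_{i=1}^{k} w_i = 0$ iff $w_i = 1$ for an even number of $i$'s,

$$ \Pr\left(\sum_{i=1}^{k} w_i = 0\right) = \sum_{s\ \mathrm{even}} \sum_{I_s} \prod_{i \in I_s} p_i \prod_{i \notin I_s} (1 - p_i), \tag{II.11} $$

where $I_s \subseteq \{1, 2, \ldots, k\}$ with $|I_s| = s$. Hence, by $p_i = \frac{d}{n}$ for $1 \le i \le n$, we have

$$ \Pr\left(\sum_{i=1}^{k} w_i = 0\right) = \sum_{s\ \mathrm{even}} \binom{k}{s} \left(\frac{d}{n}\right)^s \left(1 - \frac{d}{n}\right)^{k-s}. \tag{II.12} $$

Taking $a = \frac{d}{n}$ and $b = 1 - \frac{d}{n}$ in (II.10) verifies (II.9). ■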
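The closed form $\Pr(W \cdot V^T = 0) = \frac{1}{2}\left(1 + (1 - 2p)^k\right)$, with $p$ the probability that an entry of $W$ is one, can be confirmed by brute-force enumeration for small $k$. The sketch below uses exact fractions; the function names are illustrative and not from the paper.

```python
from fractions import Fraction
from itertools import product


def parity_zero_bruteforce(k, p):
    """Pr(w_1 + ... + w_k = 0 over F_2) by enumerating all 2^k outcomes."""
    total = Fraction(0)
    for bits in product((0, 1), repeat=k):
        if sum(bits) % 2 == 0:
            pr = Fraction(1)
            for b in bits:
                pr *= p if b else 1 - p
            total += pr
    return total


def parity_zero_closed(k, p):
    """Closed form (1 + (1 - 2p)^k) / 2."""
    return (1 + (1 - 2 * p) ** k) / 2


if __name__ == "__main__":
    p = Fraction(3, 10)  # e.g. d = 3 ones expected among n = 10 entries
    for k in range(1, 7):
        print(k, parity_zero_bruteforce(k, p), parity_zero_closed(k, p))
```

For $p = \frac{1}{2}$ the closed form reduces to $\frac{1}{2}$ for every $k > 0$, and for $p < \frac{1}{2}$ it is strictly larger than $\frac{1}{2}$, which is the comparison between sparse and dense rows used later in this section.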

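Theorem II.1 (Upper Bound for $\gamma_K(\delta, n)$). For a given error-bound $\delta$, let $k_\delta \ge 0$ be an integer
such that

$$ \frac{\log_2(1/\delta)}{n} \le \gamma_\delta := \frac{k_\delta}{n} < \frac{1 + \log_2(1/\delta)}{n}, \tag{II.13} $$

i.e., $k_\delta = \min\{k \in \mathbb{Z} \mid 2^{-k} \le \delta\}$. It then follows that

$$ \gamma_K(\delta, n) < \frac{1 + \log_2(1/\delta)}{n}. \tag{II.14} $$

Proof: Let $\bar{H}$ be an $m \times n$ binary random matrix with $m = n + k_\delta$ such that, for each row
$\bar{H}_i = (\bar{h}_{i1}, \ldots, \bar{h}_{in})$, $\Pr(\bar{h}_{ij} = 1) = \frac{1}{2}$ for $1 \le j \le n$. By Proposition II.1, $\Pr(\bar{H}_i \cdot V^T = 0) = \frac{1}{2}$
for $1 \le i \le m$ and $V \ne 0$. Then, since each $\bar{H}_i$ is independent of all other rows,

$$ \Pr(V \in \mathrm{Ker}(\bar{H})) = \prod_{i=1}^{m} \Pr(\bar{H}_i \cdot V^T = 0) = \frac{1}{2^m}. \tag{II.15} $$

Note that $\mathrm{Rank}(\bar{H}) < n$ iff $\bar{H} \cdot V^T = 0$ for some $V \ne 0$, and there are in total $2^n - 1$ nonzero
vectors in $\mathbb{F}_2^n$. Therefore,

$$ 1 - \xi_{k_\delta}(n) \le \sum_{V \ne 0} \Pr(V \in \mathrm{Ker}(\bar{H})) = \frac{2^n - 1}{2^m} < \frac{1}{2^{k_\delta}} \le \delta. \tag{II.16} $$

Hence, by (II.6), $K(1+\gamma_\delta, n) \le \delta$, and by the definition of $\gamma_K(\delta, n)$, $\gamma_K(\delta, n) \le \gamma_\delta$. The inequality
(II.14) is then clear by (II.13). ■

Although we are not able to provide a mathematical proof, experiments
exhibited that $K(1+\gamma, n)$ and $2^{-\gamma n}$ are almost identical as $\delta$ decreases. Hence $\gamma_K(\delta, n)$ is in
fact very close to $\gamma_\delta = \frac{k_\delta}{n}$. Notice that, since $\lim_{n \to \infty} \frac{1 + \log_2(1/\delta)}{n} = 0$ as long as $\delta \ge 2^{-n^c}$ for
$c < 1$, $\lim_{n \to \infty} \gamma_K(\delta, n) = 0$ for such $\delta$ by Theorem II.1.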
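The closeness of $K(1+\gamma, n)$ and $2^{-\gamma n}$ is easy to observe numerically. The sketch below computes $k_\delta$, the smallest integer $k$ with $2^{-k} \le \delta$, and prints the KFRL next to $2^{-k}$; the helper names `k_delta` and `kfrl` are illustrative, and the KFRL is taken to be $1 - \prod_{i=k+1}^{n} (1 - 2^{-i})$.

```python
def k_delta(delta):
    """Smallest integer k >= 0 such that 2^-k <= delta."""
    k = 0
    while 2.0 ** (-k) > delta:
        k += 1
    return k


def kfrl(k, n):
    """K(1 + k/n, n) = 1 - prod_{i=k+1}^{n} (1 - 2^-i)."""
    g = 1.0
    for i in range(k + 1, n + 1):
        g *= 1.0 - 2.0 ** (-i)
    return 1.0 - g


if __name__ == "__main__":
    n = 100
    for delta in (1e-2, 1e-3, 1e-4, 1e-5):
        k = k_delta(delta)
        # Columns: delta, k_delta, gamma_delta, K(1 + gamma_delta, n), 2^-k_delta
        print(delta, k, k / n, kfrl(k, n), 2.0 ** (-k))
```

In this sketch the KFRL never exceeds $2^{-k}$, since $1 - \prod_i (1 - x_i) \le \sum_i x_i$ for $x_i \in [0, 1]$, and the relative gap shrinks as $k$ grows.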

In the following lemma, we show that $K(1+\gamma, n) \le P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n)$. As a consequence of
the lemma, we show in Theorem II.2 that $\gamma_K(\delta, n) \le \gamma^*(\delta, n)$.

Lemma II.1 (KFRL as a lower bound for DEP). Let $H$ be an $m \times n$ matrix of System (I.1),
where $m = (1+\gamma)n$ with $\gamma \ge 0$. Then

$$ K(1+\gamma, n) \le P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n). \tag{II.17} $$

Proof: Let $k = \gamma n$, $m = (1+\gamma)n$, and let $\bar{H}$ be an $m \times n$ binary random matrix such that
$\Pr(\bar{h}_{ij} = 1) = \frac{1}{2}$. We first show that

$$ \Pr(\mathrm{Rank}(H) = n) \le \Pr(\mathrm{Rank}(\bar{H}) = n). \tag{II.18} $$

In LT codes, each row $H_i$ of $H$ in System (I.1) follows the uniform probability $\Pr(h_{ij} = 1) = \frac{d}{n}$
with $d \le \frac{n}{2}$, where $d = |H_i|$ with probability $\mu_d$ of the RSD $\mu(x) = \sum_d \mu_d x^d$. In LDPC codes,
$H$ of System (I.1) is formed by $n = pN$ randomly chosen columns of the check matrix $M$. In
both LT and LDPC codes, thus, $\Pr(h_{ij} = 1) \le \frac{1}{2}$ for $1 \le j \le n$. Then, by Proposition II.1,
$\Pr(\bar{H}_i \cdot V^T = 0) \le \Pr(H_i \cdot V^T = 0)$ for $V \in \mathbb{F}_2^n$, and this is true for every $1 \le i \le m$. Therefore,
$\Pr(\bar{H} \cdot V^T = 0) \le \Pr(H \cdot V^T = 0)$, and in the expectation sense, $|\mathrm{Ker}(\bar{H})| \le |\mathrm{Ker}(H)|$; hence,
the inequality (II.18) is verified. The inequality (II.17) is then clear by the lower bound in (II.6). ■

Theorem II.2 (KFRO as a lower bound for MSO). To solve System (I.1) uniquely with a destined
bound $P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n) \le \delta$, it should hold that

$$ \gamma^*(\delta, n) \ge \gamma_K(\delta, n). \tag{II.19} $$

To achieve $P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n) \le \delta$, therefore, the number of symbols that receivers should acquire
is at least $(1 + \gamma_K(\delta, n))n$ for LT codes, and $\frac{R + \gamma_K(\delta, n)}{1 + \gamma_K(\delta, n)} N$ for LDPC codes.

Proof: The inequality (II.19) is clear by Lemma II.1 and by the definitions of $\gamma^*(\delta, n)$
and $\gamma_K(\delta, n)$ in (I.4) and (II.8), respectively. To achieve $P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n) \le \delta$ with LT codes, the
inequality (II.19) implies that the number of symbols of $\beta$, equivalently the row-dimension $m$
of $H$ in System (I.1), should be at least $(1 + \gamma_K(\delta, n))n$. In the case of LDPC codes, recall that
$m = (1-R)N$ and $n = pN$. To achieve $P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n) \le \delta$ with LDPC codes, hence, (II.19)
implies that $m \ge (1 + \gamma_K(\delta, n))pN$. In other words, the number of lost symbols, equivalently
the column-dimension $n = pN$ of $H$ in System (I.1), should be at most $\frac{(1-R)N}{1 + \gamma_K(\delta, n)}$, where
$(1-R)N = m$. Therefore, the number of symbols acquired by receivers, i.e., $(1-p)N$, should
be at least $\frac{R + \gamma_K(\delta, n)}{1 + \gamma_K(\delta, n)} N$. ■
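The two symbol counts can be evaluated directly once a value of $\gamma_K(\delta, n)$ is available: at least $(1 + \gamma_K)n$ received symbols for LT codes and at least $\frac{R + \gamma_K}{1 + \gamma_K} N$ for LDPC codes. The following sketch uses exact fractions to avoid rounding up spuriously; the function names and the sample inputs are illustrative assumptions.

```python
import math
from fractions import Fraction


def min_symbols_lt(gamma_k, n):
    """Smallest integer number of symbols >= (1 + gamma_K) * n."""
    return math.ceil((1 + gamma_k) * n)


def min_symbols_ldpc(gamma_k, R, N):
    """Smallest integer number of symbols >= (R + gamma_K) / (1 + gamma_K) * N."""
    return math.ceil((R + gamma_k) / (1 + gamma_k) * N)


if __name__ == "__main__":
    gamma_k = Fraction(14, 100)  # sample overhead, taken as given here
    print(min_symbols_lt(gamma_k, 100))
    print(min_symbols_ldpc(gamma_k, Fraction(1, 2), 200))
```

Note that for LDPC codes $\gamma_K(\delta, n)$ itself depends on $n = pN$, so in practice the bound is read off at the loss rate of interest rather than from a single fixed overhead as in this sample call.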

Example II.1. Red curves in Fig. 1 represent the KFRL $K(1+\gamma, n)$, where $n = 100$ for LT
codes (top) and $n = p \cdot 200$ for LDPC codes (bottom) with $0 < p \le \frac{1}{2}$. When $\delta = 10^{-4}$, for
example, $1 + \gamma_K(10^{-4}, n) \approx 1.14$ in both LT and LDPC codes. To verify 1.14 with LDPC
codes, use the conversions in (I.2) with $p_K \approx 0.43$ in the bottom figure. By
Lemma II.1, since $K(1+\gamma, n) > 10^{-4}$ for $1+\gamma < 1.14$, this implies that the DEP of both LT and LDPC codes
cannot be better than $10^{-4}$, i.e., $P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n) > 10^{-4}$ for $\gamma < 0.14$. Again by Theorem II.2,
to achieve $P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, 100) \le 10^{-4}$ with LT codes, the minimum overhead $\gamma^*(10^{-4}, 100)$ should
be at least 0.14, i.e., $\gamma^*(10^{-4}, n) \ge 0.14$. Analogously, to achieve $P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n) \le 10^{-4}$
with the LDPC codes, where $n = p \cdot 200$, the maximum tolerable loss rate $p^* = \frac{1-R}{1 + \gamma^*(10^{-4}, n)}$ (use
the conversion in (I.2)) should be at most $p_K = \frac{1-R}{1 + \gamma_K(10^{-4}, n)} \approx 0.43$, i.e., $p^* \le 0.43$.

Another thing to notice is that, as mentioned earlier, the two curves $K(1+\gamma, n)$ and
$2^{-\gamma n}$ in the top figure are almost identical as $\delta$ decreases. In this respect, $\gamma_K(10^{-4}, 100) \approx \frac{k_\delta}{100}$,
where $k_\delta$ is the smallest integer $k$ such that $2^{-k} \le 10^{-4}$. It is not hard to see by direct computation
that $k_\delta = 14$ for $\delta = 10^{-4}$ and $\gamma_\delta = \frac{14}{100} = 0.14$, which is precisely $\gamma_K(10^{-4}, 100)$.

III. Experimental Results with LT and LDPC Codes 

In this section, we provide experimental results which show the viability that both LT and
LDPC codes may achieve error-performances in $P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n)$ and $\gamma^*(\delta, n)$ that are close
to $K(1+\gamma, n)$ and $\gamma_K(\delta, n)$, respectively, when a sufficient number of dense rows or columns are
supplemented to $H$ in System (I.1). Codes for the experiments were arranged as follows. For LDPC
codes, two check matrices of block dimension $100 \times 200$ (thus $R = \frac{1}{2}$), say $M$ and $\bar{M}$, were
arranged by using the PEG algorithm in [8]: $M$ was generated with the column-degree distribution
$\rho(x)$ in Table I, and $\bar{M}$ was generated by supplementing 15 random rows of degree $\frac{N}{2} = 100$
to a check matrix of dimension $85 \times 200$ arranged with $\rho(x)$. For LT codes, two row-degree
distributions $\mu(x)$ and $\bar{\mu}(x)$ in Table I were used for constructing codes of block length $n = 100$.
TABLE I
The row-degree distributions $\mu(x)$ and $\bar{\mu}(x)$ for LT codes (top), and the column-degree distribution $\rho(x)$
for LDPC codes (bottom).

  $\mu(x)$:        $(\mu_d)_{d=1}^{5} = (0.012, 0.482, 0.153, 0.082, 0.047)$
                   $(\mu_d) = (0.035, 0.024, 0.023, 0.012, 0.012)$ for five further degrees (index range not legible in the source)
                   $\mu_{25} = 0.059$, and one further coefficient equal to $0.059$ (index not legible in the source)

  $\bar{\mu}(x)$:  Normalization of $\mu(x) + (0.17)x^{50}$

  $\rho(x)$:       $(\rho_d)_{d=2}^{7} = (0.46, 0.32, 0.021, 0.06, 0.04, 0.025)$
                   $\rho_9 = 0.01$, $\rho_{19} = 0.02$, $\rho_{20} = 0.05$



In Fig. 1, curves represent the $K(1+\gamma, n)$'s (red ones) and the $P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n)$'s of LT and LDPC codes
(blue and black ones), where $n = 100$ for LT and $n = p \cdot 200$ for LDPC codes with $0 < p \le 0.5$.
At each point of the DEP curves, the value of $P^{\mathrm{err}}_{\mathrm{ML}}(1+\gamma, n)$ is estimated by the fraction of
rank-deficient cases among $m \times n$ matrices $H$ with $m = (1+\gamma)n$ (or the fraction of
decoding failure cases of System (I.1)) based on more than $10^6$ random constructions of $(H, \beta)$
of System (I.1). The Separated MLDA in [11], [12] was used to check the rank-deficiency.
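The estimation procedure described above can be sketched for LT-style rows as follows. This is only an illustration under stated assumptions: rank is checked by plain elimination on bitmasks rather than by the Separated MLDA, the trial count is far below $10^6$, and the degree distribution in the usage example is a hypothetical placeholder, not the $\mu(x)$ of Table I. All function names are illustrative.

```python
import random


def gf2_rank(rows):
    """Rank over F_2 of a matrix whose rows are integer bitmasks."""
    pivots = {}
    for r in rows:
        while r:
            top = r.bit_length() - 1
            if top not in pivots:
                pivots[top] = r
                break
            r ^= pivots[top]
    return len(pivots)


def sample_row(n, degrees, weights, rng):
    """Draw a degree d from the distribution, then d distinct positions uniformly."""
    d = rng.choices(degrees, weights=weights, k=1)[0]
    row = 0
    for j in rng.sample(range(n), d):
        row |= 1 << j
    return row


def estimate_dep(n, m, degrees, weights, trials, seed=1):
    """Fraction of trials in which the sampled m x n matrix is rank deficient."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        rows = [sample_row(n, degrees, weights, rng) for _ in range(m)]
        if gf2_rank(rows) < n:
            failures += 1
    return failures / trials


if __name__ == "__main__":
    # Hypothetical sparse distribution with a small dense fraction at degree n/2.
    degrees = [1, 2, 3, 4, 25]
    weights = [0.05, 0.50, 0.20, 0.10, 0.15]
    print(estimate_dep(50, 60, degrees, weights, trials=500))
```

Sweeping `m` over a range of overheads and plotting the estimates against $1+\gamma = m/n$ reproduces the shape of a DEP curve, with the resolution limited by the number of trials.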

It can be clearly seen from the figure that, when check matrices of codes are constructed with
$\mu(x)$ and $\rho(x)$ that have no dense fractions (i.e., $\mu_{50} = \rho_{100} = 0$), their DEP curves (black ones) never
drop to the error-bounds, $\delta = 10^{-2}$ with LT codes and $\delta = 10^{-3}$ with LDPC codes. These
error-flooring phenomena are due to the deficient cases of $H$, i.e., $\eta = \dim \mathrm{Ker}(H) > 0$,
that occur sporadically for large $\gamma$. In most of the deficient cases, however, $\eta$ is merely one or
two for large $\gamma$ (small $p$ for LDPC codes). This small deficiency can be readily removed by
supplementing a fraction of dense rows. To improve their DEP, we altered $\mu(x)$ of the LT code
into $\bar{\mu}(x)$ by supplementing the dense fraction $\mu_{50} = 0.17$ (thus $\bar{\mu}_{50} \approx 0.15$), and the check
matrix $M$ was redesigned to $\bar{M}$ by supplementing 15 random rows of degree 100 as stated
before. Thus, $H$ in System (I.1) by $\bar{\mu}(x)$ and $\bar{M}$ can have a sufficient number of dense rows. By
doing so, the altered codes achieved DEP curves (blue ones) and MSO $\gamma^*(\delta, n)$ that are
close to the lower bounds KFRL and KFRO for $\delta \le 10^{-4}$, respectively.
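The step from a sparse distribution to one with a dense fraction is a simple renormalization: adding mass $0.17$ at degree 50 to a distribution that sums to one and dividing by $1.17$ gives a degree-50 coefficient of $0.17 / 1.17 \approx 0.145$. A minimal sketch follows; the function name and the two-term placeholder distribution are hypothetical and are not the distribution of Table I.

```python
def add_dense_fraction(mu, degree, mass):
    """Add `mass` at `degree` to the distribution `mu` and renormalize to sum 1."""
    out = dict(mu)
    out[degree] = out.get(degree, 0.0) + mass
    total = sum(out.values())
    return {d: w / total for d, w in out.items()}


if __name__ == "__main__":
    mu = {1: 0.4, 2: 0.6}  # placeholder sparse distribution summing to 1
    mu_bar = add_dense_fraction(mu, 50, 0.17)
    print(mu_bar[50], sum(mu_bar.values()))
```

The relative proportions of the sparse degrees are unchanged by this operation; only their total mass shrinks to make room for the dense rows.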

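It is interesting to note that $K(1+\gamma, n)$ is very close to $2^{-\gamma n}$ for small $\delta$. In this case, $\gamma_K(\delta, n)$
can be understood through the integer $k_\delta$ such that $\log_2(1/\delta) \le k_\delta < 1 + \log_2(1/\delta)$, i.e., $\gamma_K(\delta, n) \approx \frac{k_\delta}{n}$.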

Although we do not present experimental evidence, supplementing about 15 percent of dense
rows to $H$ of System (I.1) does not seriously degrade the computational complexity of solving System (I.1).
For example, with the LT codes generated by $\bar{\mu}(x)$, the number of symbol
additions on $\beta$ of System (I.1) needed to compute the solution of the system under the Separated MLDA
is within 1,100 (that is, $11n$). Similarly, with the LDPC codes by $\bar{M}$, the number of symbol
additions on $\beta$ is within 1,600 (that is, $8N$).

January 13, 2009 (2nd draft)

IV. Summary 

We showed that Kovalenko's full-rank limit and its overhead are tight lower bounds for the
decoding error probability and the minimum stable overhead, respectively, of LT and LDPC codes.
We also provided experimental evidence for the viability that, when a sufficient number
of dense rows are supplemented to check matrices, both LT and LDPC codes may achieve
performances in decoding error probability and minimum stable overhead that are close
to Kovalenko's full-rank limit and its overhead, respectively.
Fig. 1. Top figure (panel title: Error-Performances of LT codes by $\mu(x)$ and $\bar{\mu}(x)$ with $n = 100$; horizontal axis: $1+\gamma$, 1 + overhead, from 1 to 1.3) shows the error-performance of LT codes by $\mu(x)$ (black) and $\bar{\mu}(x)$ (blue) in DEP vs. overhead. Bottom
figure (panel title: Error Performances of LDPC codes by $M$ and $\bar{M}$ for $N = 200$ and $R = 0.5$; horizontal axis: $p$, erasure rate) shows the error-performance of LDPC codes by $M$ (black) and $\bar{M}$ (blue) in DEP vs. erasure rate, where $p = \frac{1-R}{1+\gamma}$.